NSF PAR Search | NSF Public Access Repository


Accelerating AWP-ODC for Large-scale Earthquake Simulations Using MVAPICH2

Palla, A; Wang, S; Zhang, T; Cui, Y (August 2025, Annual MVAPICH User Group (MUG) Conference)

Accurate simulation of earthquake scenarios is essential for advancing seismic hazard analysis and risk mitigation strategies. At the San Diego Supercomputer Center (SDSC), our research focuses on optimizing the performance and reliability of large-scale earthquake simulations using the AWP-ODC software. By implementing GPU-aware MPI calls, we enable direct data processing within GPU memory, eliminating the need for explicit data transfers between CPU and GPU. This GPU-aware MPI achieves nearly ideal parallel efficiency at full scale across both NVIDIA and AMD GPUs, leveraging the MVAPICH-Plus support on Frontier at Oak Ridge National Laboratory and Vista at the Texas Advanced Computing Center. We utilized the MVAPICH-Plus 4.0 compiler to enable ZFP compression, which significantly enhances inter-node communication efficiency, a critical improvement given the communication bottleneck inherent in large-scale simulations. Our GPU-aware AWP-ODC versions include linear forward, topography, and nonlinear Iwan-type solvers with discontinuous mesh support. On the Frontier system with MVAPICH 4.0, HIP-aware MPI calls on MI250X GPUs deliver nearly ideal weak-scaling speedup up to 8,192 nodes for both linear and topography versions. On TACC’s Vista system, CUDA-aware MPI calls on GH200 GPUs substantially outperform their non-GPU-aware counterparts across all three solver versions. This poster will present a detailed evaluation of GPU-aware AWP-ODC using MVAPICH, including the impact of ZFP message compression compared to the native versions. Our results highlight the importance of MVAPICH support for GPU-aware MPI and on-the-fly compression techniques for accelerating and scaling earthquake simulations.
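As a reading aid for the weak-scaling claim in this abstract: in a weak-scaling test the work per node is held fixed, so ideal behavior is a constant time per step as nodes are added, and efficiency is commonly taken as the baseline time divided by the time at N nodes. The sketch below computes that ratio. The function name and all timings are hypothetical placeholders chosen for illustration, not measurements from the poster.

```python
def weak_scaling_efficiency(baseline_time, scaled_time):
    """Weak-scaling efficiency: T(baseline nodes) / T(N nodes) at fixed work per node."""
    if baseline_time <= 0 or scaled_time <= 0:
        raise ValueError("timings must be positive")
    return baseline_time / scaled_time


# Hypothetical wall-clock seconds per time step (illustrative values only).
timings = {1: 0.100, 64: 0.101, 1024: 0.103, 8192: 0.105}

baseline = timings[1]
efficiencies = {
    nodes: weak_scaling_efficiency(baseline, t) for nodes, t in timings.items()
}

for nodes, eff in efficiencies.items():
    print(f"{nodes:>5} nodes: efficiency = {eff:.3f}")
```

An efficiency close to 1.0 at the largest node count is what "nearly ideal" weak scaling would look like under this definition.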
Full Text Available
GPU-aware support to improve the performance of collective communication on GPUs

Cui, Y; Palla, A; Zhang, T; Wang, S; Roten, D; Koesterke, L; Zhang, Z; Maechling, P (September 2025, SCEC Publications)

We have implemented GPU-aware support across all AWP-ODC versions and enhanced message-passing collective communications for this memory-bound finite-difference solver. This provides cutting-edge communication support for production simulations on leadership-class computing facilities, including OLCF Frontier and TACC Vista. We achieved significant performance gains, reaching 37 sustained Petaflop/s and reducing time-to-solution by 17.2% using the GPU-aware feature on 8,192 Frontier nodes, or 65,536 MI250X GCDs. The AWP-ODC code has also been optimized for TACC Vista, an Arm-based NVIDIA GH200 Grace Hopper Superchip system, demonstrating excellent application performance. This poster will showcase studies and GPU performance characteristics. We will discuss our verification of GPU-aware development and the use of high-performance MVAPICH libraries, including on-the-fly compression, on modern GPU clusters.
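The scale figures in this abstract can be checked with simple arithmetic. The node count and the 17.2% reduction come from the abstract. The per-node layout (four MI250X accelerators per Frontier node, two GCDs per accelerator) is an assumption not stated there, and the variable names are illustrative.

```python
NODES = 8192             # from the abstract
MI250X_PER_NODE = 4      # assumption: Frontier node configuration
GCDS_PER_MI250X = 2      # assumption: each MI250X exposes two GCDs

total_gcds = NODES * MI250X_PER_NODE * GCDS_PER_MI250X

# A 17.2% cut in time-to-solution means the new run takes 82.8% of the old time,
# so the equivalent speedup factor is 1 / (1 - 0.172).
time_reduction = 0.172   # from the abstract
speedup = 1.0 / (1.0 - time_reduction)

print(f"GCDs in use: {total_gcds:,}")
print(f"Equivalent speedup: {speedup:.3f}x")
```

Under these assumptions the run spans 65,536 GCDs, and the reported reduction corresponds to a speedup of roughly 1.21x.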
Full Text Available
SAM-based Segmentation of Multi-Class Bridge Components from Diverse Real-Scene Inspection Images

Wang, S; Huang, Y; El-Gohary, N (May 2025, Purdue e-Pubs)

Full Text Available
A fast flood inundation model with groundwater interactions and hydraulic structures

https://doi.org/10.1016/j.advwatres.2025.105057

Sanders, BF; Schubert, JE; Martin, EMH; Wang, S; Sukop, MC; Mach, KJ (July 2025, Advances in Water Resources)

To efficiently predict flooding caused by intense rainfall (pluvial flooding), many physics-based flood inundation models adopt simplistic parameterizations of infiltration such as the Kostiakov, Horton, Soil Conservation Service and Green-Ampt methods. However, these methods are not explicitly dependent on soil moisture (or the groundwater table height), which is known to strongly influence the amount of runoff generated by rainfall. Models that fully couple surface and groundwater flow equations offer an alternative approach, but require larger amounts of input data and greater computational effort. Here we present a fast flood inundation model that couples two-dimensional shallow-water equations for surface flow with a zero-dimensional, time-dependent groundwater equation to capture sensitivity to groundwater. The model is also configured to account for storm drains, pumping and gates so human influences on flooding can be resolved, and is implemented with a dual-grid finite-volume scheme and with OpenACC directives for execution on graphics processing units (GPUs). With a 1.5 m resolution application across a 1,000 km2 area in Miami, Florida, where pluvial flooding is sensitive to depth to groundwater and simulation models that accurately reproduce observed flooding are needed to explore and plan response options, we first show that hourly water levels are predicted with a Mean Absolute Error of 8–16 cm across six canal gaging stations where flows are affected by tides, pumping, gate operations, and rainfall runoff. Second, we show high sensitivity of flooding to antecedent groundwater levels: flood extent is predicted to vary by a factor of six when initial depth to groundwater is varied between 10 and 200 cm, an amount that aligns with seasonal changes across the area. And third, we show that the model runs 30 times faster than real time (i.e., model speed = 30) using an NVIDIA V100 GPU.
Furthermore, using a 3 m resolution model of Houston, Texas, we benchmark model speeds greater than 20 and 100 for domain sizes of 10,000 and 1,000 km2, respectively. The importance of model speed is discussed in the context of flood risk management and adaptation.
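The abstract defines model speed as how many times faster than real time the model runs, i.e. simulated time divided by wall-clock time. A minimal sketch of what that implies for run times follows. The function names and the 24-hour storm duration are hypothetical; only the speeds of 30, 20, and 100 come from the abstract.

```python
def model_speed(simulated_hours, wall_clock_hours):
    """Model speed = simulated time / wall-clock time (dimensionless)."""
    if wall_clock_hours <= 0:
        raise ValueError("wall-clock time must be positive")
    return simulated_hours / wall_clock_hours


def wall_clock_minutes(simulated_hours, speed):
    """Wall-clock minutes needed to simulate an event at a given model speed."""
    if speed <= 0:
        raise ValueError("model speed must be positive")
    return simulated_hours / speed * 60.0


EVENT_HOURS = 24.0  # hypothetical storm duration, for illustration only

for label, speed in [("Miami, V100", 30), ("Houston, 10,000 km2", 20), ("Houston, 1,000 km2", 100)]:
    minutes = wall_clock_minutes(EVENT_HOURS, speed)
    print(f"{label}: speed {speed} -> about {minutes:.1f} min for a {EVENT_HOURS:.0f} h event")
```

Because the Houston speeds are reported as lower bounds ("greater than"), the corresponding run times here are upper bounds for the hypothetical event.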
Full Text Available
LIFe-GoM: Generalizable Human Rendering with Learned Iterative Feedback Over Multi-Resolution Gaussians-on-Mesh

Wen, J; Schwing, AG; Wang, S (April 2025, Proc. ICLR)

Full Text Available
Statistical Learning of Distributionally Robust Stochastic Control in Continuous State Spaces

Wang, S; Si, N; Blanchet, J; Zhou, Z (May 2025, International Conference on Artificial Intelligence and Statistics)

Full Text Available
Chirality Effects on the Intrinsic Acidity of Isomeric Tripeptides Containing a D/L-Cysteine on the N-terminus: CAA and dCAA

https://doi.org/10.1016/j.ijms.2025.117472

Wang, S; Buen, Z; Harvey, K R; Zhang, Y; Ren, J (May 2025, International journal of mass spectrometry)
Laskin, J; Ouyang, Z (Ed.)
Chirality effects on the intrinsic gas-phase acidity of oligopeptides have been studied using a pair of stereoisomeric tripeptides consisting of a D/L-cysteine (C) and two residues of alanine (A): CAA and dCAA, where the C-terminus is amidated. Mass spectrometry measurements through bracketing via collision-induced dissociation clearly show that CAA is a stronger gas-phase acid than dCAA. Quantitative values of the acidity were determined using the extended Cooks kinetic method. The resulting deprotonation enthalpy (∆acidH) for CAA is 326.2 kcal/mol (1364.7 kJ/mol) and for dCAA it is 326.8 kcal/mol (1367.6 kJ/mol). The corresponding gas-phase acidity (∆acidG) for CAA is 321.3 kcal/mol (1344.2 kJ/mol) and for dCAA it is 322.0 kcal/mol (1347.3 kJ/mol). Changing the N-terminal cysteine from the L-form to the D-form reduces the gas-phase acidity by about 0.6 kcal/mol (2.5 kJ/mol). Extensive conformational searches followed by quantum chemical calculations at the ωB97X-D/6-311+G(d,p) level of theory yielded a set of lowest energy conformations for each peptide species. Theoretical gas-phase acidities calculated using the Boltzmann averaged conformational contributions are in good agreement with the experimental data. The shift in the acidity is likely due to the conformational effect induced by D-cysteine, which increases the stability of the neutral dCAA, and hence reduces its acidity. A chirality change on a single amino acid can have a noticeable effect on the biochemical properties of peptides and proteins.
Full Text Available
Explore the Reasoning Capability of LLMs in the Chess Testbed

Wang, S; Ji, L; Wang, R; Zhao, W; Liu, H; Hou, Y; Wu, Y N (May 2025, Annual Conference of the Nations of the Americas Chapter of the ACL (NAACL))

Full Text Available
An Efficient High-dimensional Gradient Estimator for Stochastic Differential Equations

Wang, S; Blanchet, J; Glynn, P (December 2024, Advances in Neural Information Processing Systems)

Full Text Available
Enhancing interview protocols: topic modeling as a content validity technique

Madones, C; Lahoud, T; Tang, C; Yang, Y; Wang, S; Cohen, A; Wang, K; Orrill, C; Brown, R (April 2025, National Council on Measurement in Education)

Full Text Available